Topological Residual Asymmetry for Bivariate Causal Direction

arXiv.org Machine Learning

Inferring causal direction from purely observational bivariate data is fragile: many methods commit to a direction even in ambiguous or near non-identifiable regimes. We propose Topological Residual Asymmetry (TRA), a geometry-based criterion for additive-noise models. TRA compares the shapes of two cross-fitted regressor-residual clouds after rank-based copula standardization: in the correct direction, residuals are approximately independent, producing a two-dimensional bulk, while in the reverse direction -- especially under low noise -- the cloud concentrates near a one-dimensional tube. We quantify this bulk-tube contrast using a 0D persistent-homology functional, computed efficiently from Euclidean MST edge-length profiles. We prove consistency in a triangular-array small-noise regime, extend the method to fixed noise via a binned variant (TRA-s), and introduce TRA-C, a confounding-aware abstention rule calibrated by a Gaussian-copula plug-in bootstrap. Extensive experiments across many challenging synthetic and real-data scenarios demonstrate the method's superiority.
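The bulk-versus-tube contrast can be illustrated with a small sketch. It relies on a standard fact: for a Euclidean point cloud, the death times of 0D persistent homology coincide with the minimum spanning tree (MST) edge lengths, up to the scaling convention of the filtration. The abstract does not give the actual TRA functional, so the total MST length used below is only a stand-in summary. The two clouds are synthetic toys rather than cross-fitted regressor-residual pairs, and the helper names `rank_standardize` and `mst_edge_lengths` are hypothetical.

```python
import math
import random


def rank_standardize(values):
    """Map values into (0, 1) via rank / (n + 1); ties are broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def mst_edge_lengths(points):
    """Prim's algorithm on the complete Euclidean graph; returns sorted MST edge lengths."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # best[i]: distance from point i to the current tree
    best[0] = 0.0
    lengths = []
    for step in range(n):
        j = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[j] = True
        if step > 0:  # the first vertex joins the tree without an edge
            lengths.append(best[j])
        for i in range(n):
            if not in_tree[i]:
                d = math.dist(points[j], points[i])
                if d < best[i]:
                    best[i] = d
    return sorted(lengths)


random.seed(0)
n = 200

# Toy "bulk": two independent coordinates fill the unit square after ranking.
bx = [random.random() for _ in range(n)]
by = [random.random() for _ in range(n)]
bulk = list(zip(rank_standardize(bx), rank_standardize(by)))

# Toy "tube": the second coordinate is almost a function of the first (low noise).
tx = [random.random() for _ in range(n)]
ty = [x * x + 1e-4 * random.gauss(0.0, 1.0) for x in tx]
tube = list(zip(rank_standardize(tx), rank_standardize(ty)))

bulk_edges = mst_edge_lengths(bulk)
tube_edges = mst_edge_lengths(tube)
print("total MST length, bulk:", sum(bulk_edges))
print("total MST length, tube:", sum(tube_edges))
```

In this toy setting the tube-like cloud yields a much shorter MST than the two-dimensional bulk, which is the kind of contrast the abstract describes.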


KNARsack: Teaching Neural Algorithmic Reasoners to Solve Pseudo-Polynomial Problems

arXiv.org Artificial Intelligence

Neural algorithmic reasoning (NAR) is a growing field that aims to embed algorithmic logic into neural networks by imitating classical algorithms. In this extended abstract, we detail our attempt to build a neural algorithmic reasoner that can solve Knapsack, a pseudo-polynomial problem bridging classical algorithms and combinatorial optimisation, but omitted in standard NAR benchmarks. Our neural algorithmic reasoner is designed to closely follow the two-phase pipeline for the Knapsack problem, which involves first constructing the dynamic programming table and then reconstructing the solution from it. The approach, which models intermediate states through dynamic programming supervision, achieves better generalization to larger problem instances than a direct-prediction baseline that attempts to select the optimal subset only from the problem inputs.
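For reference, the classical two-phase pipeline that the reasoner is said to imitate can be sketched as follows. This is the textbook 0/1 Knapsack dynamic program, not the neural model; the function names and the example instance are illustrative assumptions.

```python
def knapsack_table(weights, values, capacity):
    """Phase 1: dp[i][c] is the best value using the first i items within capacity c."""
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for c in range(capacity + 1):
            dp[i][c] = dp[i - 1][c]  # skip item i-1
            if w <= c:
                dp[i][c] = max(dp[i][c], dp[i - 1][c - w] + v)  # take item i-1
    return dp


def reconstruct(dp, weights, capacity):
    """Phase 2: walk the table backwards to recover one optimal subset of item indices."""
    chosen = []
    c = capacity
    for i in range(len(weights), 0, -1):
        if dp[i][c] != dp[i - 1][c]:  # the value changed, so item i-1 was taken
            chosen.append(i - 1)
            c -= weights[i - 1]
    return sorted(chosen)


# Hypothetical example instance.
weights = [1, 3, 4, 5]
values = [1, 4, 5, 7]
capacity = 7

table = knapsack_table(weights, values, capacity)
best_value = table[len(weights)][capacity]
chosen = reconstruct(table, weights, capacity)
print(best_value, chosen)
```

The table has (n + 1) x (capacity + 1) entries, so the running time grows with the numeric value of the capacity rather than with its bit length, which is what makes the problem pseudo-polynomial.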


Efficient Optimization of a Permanent Magnet Array for a Stable 2D Trap

arXiv.org Artificial Intelligence

Untethered magnetic manipulation of biomedical millirobots has high potential for minimally invasive surgical applications. However, it is still challenging to exert high actuation forces on small robots over a large distance. Permanent magnets offer stronger magnetic torques and forces than electromagnetic coils, but feedback control is more difficult. By Earnshaw's theorem, a stable magnetic trap in 3D cannot be achieved with static permanent magnets. Here, we report a stable 2D magnetic force trap produced by an array of permanent magnets to control a millirobot. The trap is located in open space at a tunable distance of 20-120 mm from the magnet array, which is relevant to human anatomical scales. The design is achieved by a novel GPU-accelerated optimization algorithm that uses a mean squared error (MSE) loss and the Adam optimizer to efficiently compute the optimal angles for any number of magnets in the array. The algorithm is verified using numerical simulation and physical experiments with an array of two magnets. A millirobot is successfully trapped and controlled to follow a complex trajectory. The algorithm demonstrates high scalability by optimizing the angles for 100 magnets in under three seconds. Moreover, the optimization workflow can be adapted to optimize a permanent magnet array to achieve desired force vector fields.
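The optimization pattern named in the abstract, an MSE loss minimized over magnet angles with Adam, can be sketched on a toy problem. Everything specific here is an assumption: the "field" is simply the mean of unit vectors at the given angles, not a magnetic dipole model; the target vector, the starting angles and the hyperparameters are made up; and the code runs on the CPU in plain Python rather than on a GPU.

```python
import math


def predict(angles):
    """Toy stand-in for a field model: mean of unit vectors at the given angles."""
    n = len(angles)
    return (sum(math.cos(a) for a in angles) / n,
            sum(math.sin(a) for a in angles) / n)


def mse_loss(angles, target):
    px, py = predict(angles)
    return ((px - target[0]) ** 2 + (py - target[1]) ** 2) / 2.0


def mse_grad(angles, target):
    """Analytic gradient of mse_loss with respect to each angle."""
    n = len(angles)
    px, py = predict(angles)
    rx, ry = px - target[0], py - target[1]
    return [(-rx * math.sin(a) + ry * math.cos(a)) / n for a in angles]


def adam_minimize(grad_fn, params, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Standard Adam update with bias-corrected first and second moments."""
    params = list(params)
    m = [0.0] * len(params)
    v = [0.0] * len(params)
    for t in range(1, steps + 1):
        g = grad_fn(params)
        for i in range(len(params)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


target = (0.5, 0.5)                # hypothetical desired vector
start = [0.0, 0.5, 1.0, 1.5]       # hypothetical initial angles in radians
initial_loss = mse_loss(start, target)
final_angles = adam_minimize(lambda a: mse_grad(a, target), start)
final_loss = mse_loss(final_angles, target)
print(initial_loss, final_loss)
```

Replacing `predict` with a physical field model and evaluating the loss on a grid of sample points would bring the sketch closer to the setting described, but those details are not given in the abstract.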


Decoding Positive Selection in Mycobacterium tuberculosis with Phylogeny-Guided Graph Attention Models

arXiv.org Artificial Intelligence

Positive selection drives the emergence of adaptive mutations in Mycobacterium tuberculosis, shaping drug resistance, transmissibility, and virulence. Phylogenetic trees capture evolutionary relationships among isolates and provide a natural framework for detecting such adaptive signals. We present a phylogeny-guided graph attention network (GAT) approach, introducing a method for converting SNP-annotated phylogenetic trees into graph structures suitable for neural network analysis. Using 500 M. tuberculosis isolates from four major lineages and 249 single-nucleotide variants (84 resistance-associated and 165 neutral) across 61 drug-resistance genes, we constructed graphs where nodes represented isolates and edges reflected phylogenetic distances. Edges between isolates separated by more than seven internal nodes were pruned to emphasise local evolutionary structure. Node features encoded SNP presence or absence, and the GAT architecture included two attention layers, a residual connection, global attention pooling, and a multilayer perceptron classifier. The model achieved an accuracy of 0.88 on a held-out test set and, when applied to 146 WHO-classified "uncertain" variants, identified 41 candidates with convergent emergence across multiple lineages, consistent with adaptive evolution. This work demonstrates the feasibility of transforming phylogenies into GNN-compatible structures and highlights attention-based models as effective tools for detecting positive selection, aiding genomic surveillance and variant prioritisation.
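The tree-to-graph conversion with pruning can be sketched as below. This is a minimal illustration under stated assumptions: the tree is given as a child-to-parent map, isolates are the leaves, "separated by" is read as the number of non-leaf nodes on the path between two leaves, and edge weights (phylogenetic distances) are omitted. The function names and the tiny example tree are hypothetical, and the abstract does not say how the authors computed the separation.

```python
from itertools import combinations


def ancestors(node, parent):
    """Return the path from node up to the root, starting with node itself."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def internal_nodes_between(u, v, parent):
    """Count the nodes strictly between leaves u and v on their tree path."""
    up_u = ancestors(u, parent)
    steps_from_u = {name: i for i, name in enumerate(up_u)}
    for j, name in enumerate(ancestors(v, parent)):
        if name in steps_from_u:  # first shared ancestor is the lowest common one
            i = steps_from_u[name]
            return i + j - 1      # i + j edges, so i + j + 1 nodes, minus 2 endpoints
    raise ValueError("nodes are not in the same tree")


def build_isolate_graph(parent, max_internal=7):
    """Connect every pair of leaves separated by at most max_internal internal nodes."""
    leaves = sorted(set(parent) - set(parent.values()))
    return [(u, v) for u, v in combinations(leaves, 2)
            if internal_nodes_between(u, v, parent) <= max_internal]


# Hypothetical toy tree: root -> a -> {s1, s2}; root -> b -> c -> s3
parent = {"a": "root", "b": "root", "s1": "a", "s2": "a", "c": "b", "s3": "c"}

local_edges = build_isolate_graph(parent, max_internal=2)
all_edges = build_isolate_graph(parent, max_internal=7)
print(local_edges)
print(all_edges)
```

With the threshold lowered to 2 for this small tree, only the sibling pair stays connected, which mirrors how the pruning emphasises local evolutionary structure.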


e5f6ad6ce374177eef023bf5d0c018b6-Reviews.html

Neural Information Processing Systems

This paper develops a model for multifurcating trees with edge lengths and observed data at the tree leaves; the model is based on the beta coalescent from the probability literature. The authors develop an MCMC inference scheme for their model, in which they draw on existing work that uses belief propagation to perform inference for the Kingman coalescent (an edge case of the beta coalescent in which all trees are binary). The particular challenge for inference here is that there are many more possible parent-child node relationships when parents can have multiple children (not just two). The authors seem to use a Dirichlet Process mixture model (DPMM) at each node to narrow down the space of possible children subsets to consider. As the authors note, even inference with the Kingman coalescent is a hard problem. In experiments, they compare to the Kingman coalescent and hierarchical agglomerative clustering. The Kingman coalescent is a popular modeling tool, so it is great to see a practical extension of the Kingman coalescent to the multifurcating case being explored for inference.




Finding Closure: A Closer Look at the Gestalt Law of Closure in Convolutional Neural Networks

arXiv.org Artificial Intelligence

The human brain has an inherent ability to fill in gaps to perceive figures as complete wholes, even when parts are missing or fragmented. This phenomenon is known as Closure in psychology, one of the Gestalt laws of perceptual organization, explaining how the human brain interprets visual stimuli. Given the importance of Closure for human object recognition, we investigate whether neural networks rely on a similar mechanism. Exploring this crucial human visual skill in neural networks has the potential to highlight their comparability to humans. Recent studies have examined the Closure effect in neural networks. However, they typically focus on a limited selection of Convolutional Neural Networks (CNNs) and have not reached a consensus on their capability to perform Closure. To address these gaps, we present a systematic framework for investigating the Closure principle in neural networks. We introduce well-curated datasets designed to test for Closure effects, including both modal and amodal completion. We then conduct experiments on various CNNs employing different measurements. Our comprehensive analysis reveals that VGG16 and DenseNet-121 exhibit the Closure effect, while other CNNs show variable results. We interpret these findings by blending insights from psychology and neural network research, offering a unique perspective that enhances transparency in understanding neural networks. Our code and dataset will be made available on GitHub.


Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent, Jordan Boyd-Graber, Hal Daumé III, Z. Irene Ying

Neural Information Processing Systems

Discovering hierarchical regularities in data is a key problem in interacting with large datasets, modeling cognition, and encoding knowledge. A previous Bayesian solution--Kingman's coalescent--provides a probabilistic model for data represented as a binary tree. Unfortunately, this is inappropriate for data better described by bushier trees. We generalize an existing belief propagation framework of Kingman's coalescent to the beta coalescent, which models a wider range of tree structures. Because of the complex combinatorial search over possible structures, we develop new sampling schemes using sequential Monte Carlo and Dirichlet process mixture models, which render inference efficient and tractable. We present results on synthetic and real data that show the beta coalescent outperforms Kingman's coalescent and is qualitatively better at capturing data in bushy hierarchies.